Stretching of Proteins in the Entropic Limit 
Mechanical stretching of six proteins is studied through molecular
dynamics simulations. The model is Go-like, with Lennard-Jones
interactions at native contacts. Low temperature unfolding scenarios
are remarkably complex and sensitive to small structural changes.
Thermal fluctuations reduce the peak forces and the number of
metastable states during unfolding. The unfolding pathways also
simplify as temperature rises. In the entropic limit, all proteins
show a monotonic decrease of the extension where bonds rupture with
their separation along the backbone (contact order).

PACS numbers: 87.15.La, 87.15.He, 87.15.Aa



There is considerable current interest in the mechanical manipulation
of single biological molecules. In particular, stretching studies of
large proteins with atomic force microscopes (AFM) and optical
tweezers reveal intricate, specific, and reproducible force (F) -
displacement (d) curves that call for understanding and theoretical
interpretation. The patterns depend on the pulling speed and on the
stiffness of the pulling device [3]. They must also depend on the
effective temperature given by the ratio of thermal to binding
energies. In experiments this ratio can be varied slightly by
changing temperature T, and over a large range by changing solvent
properties, such as pH. While the role of effective temperature in
folding is well studied, its effect on mechanical unfolding is not.

In this paper, we report results of molecular dynamics simulations of
simplified models of six proteins that reveal universal trends with
increases in the effective temperature. Thermal fluctuations
accelerate rupture, lower the peaks in the F-d curves, and reduce the
number of peaks. Most interestingly, the succession of unfolding
events simplifies from a complex pattern determined by the energy
landscape into a simple, uniform pattern determined by entropic
considerations. All of these changes are gradual, and move to
completion near the T where the specific heat peaks. In the entropic
limit, rupturing events are governed exclusively by the contact
order, i.e. by the distance along a sequence between two amino acids
which make a contact in the native state. The contact order is also
believed to be the major influence in folding to the native state.
However, there is no general correlation between folding and the
extremely complex unfolding scenarios observed at low T.

The models we use are coarse-grained and Go-like. Full details are
provided in earlier studies of folding [3]. Briefly, the amino acids
are represented by point particles of mass m located at the positions
of the Cα atoms. They are tethered by a strong harmonic potential
with a minimum at the peptide bond length. The native structure of a
protein is taken from the Protein Data Bank (PDB) and the
interactions between the amino acids are divided into native and
non-native contacts. The distinction is made by taking the fully
atomic representation of the amino acids in the native state and then
associating native contacts with overlapping amino acids. The
criterion for overlaps uses the van der Waals radii of the atoms
multiplied by 1.24 to account for the softness of the potential [9].
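
The overlap criterion can be illustrated with a minimal sketch. It assumes the caller supplies heavy-atom coordinates, a van der Waals radius per atom, and the residue index of each atom; the function name, the `min_separation` parameter (which skips residues that are neighbors along the chain), and the radii used in the example are hypothetical and are not taken from the text.

```python
import numpy as np

def native_contacts(coords, radii, residue_index, scale=1.24, min_separation=2):
    """Return residue pairs (i, j), i < j, whose atoms overlap in the native state.

    coords        : (n_atoms, 3) atomic positions in Angstrom
    radii         : (n_atoms,) van der Waals radii in Angstrom (user supplied)
    residue_index : (n_atoms,) index of the residue each atom belongs to
    scale         : factor multiplying the radii (1.24 in the text)
    min_separation: hypothetical cutoff that excludes near neighbors in sequence
    """
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)
    residue_index = np.asarray(residue_index)

    # All pairwise atom-atom distances.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Two atoms overlap when their enlarged spheres intersect.
    cutoff = scale * (radii[:, None] + radii[None, :])
    overlap = dist < cutoff

    contacts = set()
    for a, b in zip(*np.nonzero(overlap)):
        i, j = int(residue_index[a]), int(residue_index[b])
        if j - i >= min_separation:
            contacts.add((i, j))
    return contacts

# Three atoms on a line, one per residue, all with an illustrative 1.7 A radius.
example = native_contacts(
    coords=[[0, 0, 0], [4, 0, 0], [9, 0, 0]],
    radii=[1.7, 1.7, 1.7],
    residue_index=[0, 3, 6],
)
print(example)
```

With the 1.24 factor the first two atoms (4 Å apart) count as overlapping, whereas with unscaled radii they would not.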

The interaction between each pair of overlapping amino acids i and j
is described with a 6-12 Lennard-Jones (LJ) potential whose
interaction length σ_ij (4.4 to 12.8 Å) is chosen so that the
potential energy minimum coincides with the native Cα-Cα distance.
This forces the ground state to coincide with the native state at
room T. The non-native contacts are described by a LJ potential with
σ = 5 Å that is truncated at r > 2^(1/6) σ to produce a purely
repulsive force. An energy penalty is added to states with the wrong
chirality as described elsewhere. This facilitates folding, but has
little effect on mechanical stretching since the protein starts with
the correct chirality. All of the potentials have a common energy
scale ε, which is taken as the unit of energy. Time is measured in
terms of the usual LJ vibrational time scale τ = (mσ²/ε)^(1/2).
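
A short sketch of the two contact potentials may help. For a 6-12 LJ potential the minimum sits at 2^(1/6) times the interaction length, so the interaction length of a native contact follows from the native distance. The cutoff at the LJ minimum and the upward shift by ε that makes the truncated potential continuous are assumptions of this sketch (the usual way to obtain a purely repulsive LJ force); the function names are hypothetical.

```python
import numpy as np

R_MIN_FACTOR = 2.0 ** (1.0 / 6.0)  # position of the LJ minimum in units of sigma

def sigma_from_native_distance(r_native):
    """Interaction length that puts the LJ minimum at the native distance."""
    return r_native / R_MIN_FACTOR

def lj_native(r, sigma, eps=1.0):
    """Full 6-12 LJ potential used for native contacts (energy in units of eps)."""
    x6 = (sigma / np.asarray(r, dtype=float)) ** 6
    return 4.0 * eps * (x6 * x6 - x6)

def lj_repulsive(r, sigma=5.0, eps=1.0):
    """LJ potential cut at its minimum, so only the repulsive branch remains.

    The shift by +eps (assumed here) makes the energy go continuously to zero.
    """
    r = np.asarray(r, dtype=float)
    x6 = (sigma / r) ** 6
    v = 4.0 * eps * (x6 * x6 - x6) + eps
    return np.where(r < R_MIN_FACTOR * sigma, v, 0.0)

sigma_ij = sigma_from_native_distance(6.0)   # native distance of 6 A
print(sigma_ij, lj_native(6.0, sigma_ij))    # the energy at the native distance is -eps
```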

The desired effective temperature T̃ = k_B T/ε (quoted simply as T
below), where k_B is Boltzmann's constant, is maintained by coupling
each Cα to a Langevin noise [10] and damping constant γ. The value of
γ = 2m/τ is large enough to produce the overdamped dynamics
appropriate for proteins in a solvent [3], but about 25 times smaller
than the realistic damping from water. Previous studies show that
this speeds the dynamics by about a factor of 25 without altering
behavior, and tests with larger γ confirmed that it merely rescales
the diffusion time in the overdamped regime. As a result the
effective value of τ in simulations is about 75 ps.
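
As an illustration of the Langevin coupling, the following sketch advances positions and velocities by one time step in reduced units (m = τ = ε = k_B = 1, so γ = 2). The integration scheme used in the simulations is not stated in this section; the simple Euler-Maruyama update below, the function name, and the time step are assumptions made for illustration only.

```python
import numpy as np

def langevin_step(x, v, force, T, dt, gamma=2.0, m=1.0, rng=None):
    """One Euler-Maruyama step of m dv/dt = F(x) - gamma v + noise.

    The random force has variance 2 gamma T / dt per component, which is the
    fluctuation-dissipation relation in reduced units (k_B = 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    noise = np.sqrt(2.0 * gamma * T * dt) * rng.standard_normal(x.shape)
    v_new = v + (force(x) - gamma * v) * dt / m + noise / m
    x_new = x + v_new * dt
    return x_new, v_new

# Free particle at T = 0: the velocity simply decays through the damping term.
x1, v1 = langevin_step(np.zeros(3), np.ones(3), lambda x: np.zeros_like(x),
                       T=0.0, dt=0.01)
print(x1, v1)
```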

Stretching is accomplished by attaching both ends of the protein to
harmonic springs of spring constant k = 0.12 ε/Å² [3]. This
corresponds to a cantilever stiffness k/2 ≈ 0.2 N/m, which is typical
of an AFM. Stretching is implemented parallel to the initial
end-to-end position vector of the protein. The outer end of one
spring is held fixed at the origin, and the outer end of the other is
pulled at constant speed v_p = 0.005 Å/τ. Previous studies at T = 0
showed that decreasing the velocity below



[Figure 1 plot: panels (a), (b), (c), titled "T4 lysozymes, soft";
horizontal axis d [Å], ticks from 100 to 600.]

FIG. 1: F-d curves for three lysozyme systems: (a) 1021, (b) 1b6i
pulled from its ends and (c) 1b6i pulled by the cysteines at
locations 21 and 124. Thick and thin solid lines correspond to T = 0
and 0.2, respectively. The dashed line in (c) shows T = 0 results for
1b6i with the 1-20 and 125-163 amino acids removed.

this value had little effect on unfolding. This velocity corresponds
to about 7 × 10^6 nm/s. Velocities in all-atom simulations are more
than three orders of magnitude faster, but experimental AFM
velocities are 1000 times slower. Closing this gap remains a
formidable challenge. Varying the simulation temperature is one way
to accelerate experimental dynamics into an accessible range.
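
The quoted physical pulling speed follows from the reduced speed of 0.005 Å per time unit and the effective time unit of about 75 ps. A small check of that arithmetic (the function name is hypothetical):

```python
def pulling_speed_nm_per_s(v_angstrom_per_tau=0.005, tau_ps=75.0):
    """Convert a speed in Angstrom per LJ time unit to nm/s."""
    angstrom_per_ps = v_angstrom_per_tau / tau_ps
    # 1 Angstrom = 0.1 nm and 1 ps = 1e-12 s.
    return angstrom_per_ps * 0.1 / 1e-12

v_sim = pulling_speed_nm_per_s()
print(f"simulation: {v_sim:.2e} nm/s")            # roughly 7e6 nm/s
print(f"1000 times slower: {v_sim / 1000:.2e} nm/s")
```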

The displacement of the pulled end of the spring is denoted by d. The
net force stretching the protein is denoted by F, and measured from
the extension of the pulling spring. Except at T = 0, where the
results are strictly reproducible, F is averaged over a displacement
of 0.5 Å to reduce thermal noise without substantially affecting
spatial resolution. A contact between amino acids i and j is
considered ruptured if the distance between them exceeds 1.5 σ_ij.
The unfolding scenario is specified by the unbinding or breaking
distance d_u for each native contact. Note that at finite T the
contact may break and reform several times and d_u is associated with
the final rupture.
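
The bookkeeping for the breaking distance can be sketched as follows, assuming a recorded trajectory of the spring displacement d and of the distance between one pair of contacting amino acids. The function name and the convention of reporting the first recorded displacement after the last intact frame are choices made for this sketch.

```python
import numpy as np

def final_rupture_displacement(d, r_ij, sigma_ij, factor=1.5):
    """Displacement at which a contact breaks for the last time.

    d        : displacement of the pulled spring end at each recorded frame
    r_ij     : distance between amino acids i and j at each frame
    sigma_ij : LJ interaction length of the contact
    Returns None if the contact is still intact in the last frame.
    """
    d = np.asarray(d, dtype=float)
    r_ij = np.asarray(r_ij, dtype=float)
    intact = r_ij <= factor * sigma_ij
    if not intact.any():
        return float(d[0])          # already broken in the first frame
    last_intact = int(np.nonzero(intact)[0][-1])
    if last_intact == len(d) - 1:
        return None                 # no final rupture within the trajectory
    return float(d[last_intact + 1])

# The contact breaks at d = 1, reforms at d = 2, and breaks for good at d = 4.
print(final_rupture_displacement([0, 1, 2, 3, 4, 5], [5, 8, 5, 6, 9, 10], 5.0))
```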

To illustrate the sensitivity of low T unfolding to small structural
changes, we consider bacteriophage T4 lysozymes. There are two mutant
structures with sequence length N = 163 whose PDB codes are 1021 and
1b6i. The latter has cysteines at locations 21 and 124 instead of
threonine and lysine, respectively. Although the root mean square
deviation between the two structures is merely 0.3 Å, there are
noticeable differences in the T = 0 simulations of stretching in
Fig. 1 (thick lines). The force
FIG. 2: Breaking distances for bonds in 1021 plotted against
corresponding values for 1b6i at the indicated T.



curves show a series of upward ramps where the protein is stuck in a
metastable state, followed by sharp drops as one or more contacts
break, allowing the intervening segment to stretch. The differences
between results for the two lysozymes, especially around d of 200,
375, and 475 Å, indicate different sets of broken contacts. Fig. 2
compares the rupturing distances of each native contact in the two
mutants. If the mutants followed the same pattern, the points would
lie on a line with unit slope. However, they clearly follow different
patterns at T = 0. Sensitivity of F-d curves to point mutations has
been demonstrated in recent experiments on an immunoglobulin module
in human cardiac titin.

Figure 1(c) shows the importance of the location of the pulling
force. Here the pulling springs were attached to the cysteines at
i = 21 and 124 of 1b6i, shortening the effective sequence length to
104 amino acids. Yang et al. have used an AFM to study a string of
1b6i proteins bound covalently at these sites, and observe a series
of equally spaced peaks that indicates the repeated units unfold
sequentially. As shown in our earlier work, sequential unfolding
occurs when the largest force peak breaks the first contacts in a
repeat unit. Pulling from the i = 21 and 124 sites produces a strong
peak near the start of unfolding, which is consistent with the
sequential unfolding observed by Yang et al. This peak is absent for
the full lysozymes and for the sequence from i = 21 to 124 with amino
acids 1-20 and 125-163 removed (dashed line). We would thus predict
that repeated arrays of the full or truncated sequences would unfold
simultaneously if linked at their ends. It would be interesting to
test these predictions with experiments. Note that recent experiments
on another protein, E2lip3, have demonstrated that F-d patterns can
depend strongly on the pulling geometry, particularly the direction
of the force relative
FIG. 3: Contact breaking distances for 1021 vs. contact order
at the indicated T.



to key native contacts [12].

We now turn to the effect of temperature on the force and unfolding
sequence. Increasing T to 0.2 (thin lines in Fig. 1) produces similar
changes in the force curves for all lysozymes. Thermal activation
reduces the force needed to rupture bonds, shifting the entire force
curve downwards. Some peaks disappear, indicating that the states are
no longer metastable at this stress and T. Studies of the largest
peaks show that they decrease roughly linearly with T, and shift to
smaller d_u. For T > 0.6 no maxima can be identified and the curves
approach the entropic limit of a worm-like chain (WLC) at higher T.
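
For orientation, the entropic force of a worm-like chain is often evaluated with the Marko-Siggia interpolation formula, F = (k_B T / p) [1/(4(1 - x/L)^2) - 1/4 + x/L], where x is the end-to-end extension, L the contour length and p the persistence length. The sketch below uses that standard formula; whether the WLC reference cited in the text uses exactly this form is not stated here, and the contour and persistence lengths in the example are hypothetical values chosen only for illustration.

```python
def wlc_force(x, contour_length, persistence_length, kT):
    """Marko-Siggia interpolation for the force of a worm-like chain.

    x and both lengths share one length unit; the result is in kT per that unit.
    """
    z = x / contour_length
    if not 0.0 <= z < 1.0:
        raise ValueError("extension must satisfy 0 <= x < contour_length")
    return (kT / persistence_length) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)

# Illustrative numbers: 163 residues with 3.8 A per bond, and p = 4 A (assumed).
L_c = 162 * 3.8
for x in (100.0, 300.0, 500.0):
    print(x, wlc_force(x, L_c, persistence_length=4.0, kT=0.8))
```

The force vanishes at zero extension and diverges as the extension approaches the contour length, with no intermediate maxima.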

The force curves and unfolding scenarios (Fig. 2) of the two mutant
structures become more similar with increasing T. Similar plots of
the unfolding sequences for stiff vs. soft pulling springs also show
considerable differences at low T that disappear as T rises. Fig. 3
shows that the sequence of unfolding distances also simplifies
dramatically with increasing T. Values of d_u for each bond in 1021
are plotted against the bond's contact order |j - i|. The T = 0
unravelling scenario is complex: low contact order bonds break over
the entire range of d_u, while middle and long-range bonds break in
clusters of events at a few d_u. At T = 0.2, the events collapse onto
a smaller set of lines. By T = 0.8, events have nearly collapsed onto
a monotonically decreasing curve that sharpens with further increases
in T.

The simplification in unfolding sequence is even more dramatic for
the tandem arrangement of three I27 domains of titin shown in Fig. 4.
At T = 0, there is a serial unwinding of the individual domains that
produces a repeated pattern. One domain unfolds from d_u = 0 to
300 Å, the next from 300 to 600 Å, and the last from 600 to 900 Å.
The three sequences can be overlapped by



FIG. 4: Unravelling of three serial domains of titin on the
d_u - |j - i| plane. Squares (stars) correspond to the most
forward (backward) domain.




FIG. 5: Unfolding in the high T limit for 1crn (crambin; N = 46),
1tit (the I27 domain of titin; N = 89), 1quu (actinin; N = 248),
1021 (lysozyme; N = 163), 1a2p (barnase; N = 108), and 1icx (yellow
lupin protein 10; N = 155). Both axes are scaled by N and d_u is
divided by the peptide bond length of 3.8 Å to make it dimensionless.
For each protein the value of T needed to reach the entropic limit
was near T_max. These values were 0.6, 0.8, 1.2, 1.4, 0.8, and 1.2,
respectively.

a vertical displacement. By T = 0.8, unfolding occurs simultaneously
and d_u has collapsed onto a nearly monotonic curve similar to that
found in Fig. 3.

We have examined the T-dependence of unfolding for a large number of
proteins, including periodically repeated domains. In all cases,
increasing T produces a gradual reduction in the size and number of
force peaks and an increasingly universal relation between d_u and
contact order. These changes saturate near the temperature T_max
where each protein has a maximum in the equilibrium specific heat.
Fig. 5 compares the normalized unfolding curves for six proteins at
temperatures near their T_max. For each protein, d_u decreases nearly
monotonically at this T. The curves are qualitatively similar, but
results for shorter proteins tend to lie above those for longer
proteins. Studies of the rate dependence show that this is because
longer proteins take more time to sample configurations and thus are
less likely to reform contacts with large |j - i|/N. The curves can
be made more universal by choosing different pulling rates for each
protein. Studies with artificial proteins (homopolymers), where every
contact is native, indicate a logarithmic rate dependence with the
normalized curves moving gradually up and to the right.
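
The normalization used to compare proteins of different length can be sketched as below: the contact order is divided by N and the breaking distance by N times the peptide bond length of 3.8 Å. The function names and the simple fraction-of-decreasing-steps diagnostic are hypothetical choices for this sketch, not a measure taken from the text.

```python
import numpy as np

PEPTIDE_BOND = 3.8  # Angstrom

def normalize_unfolding_curve(contacts, d_u, n_residues):
    """Return (|j - i| / N, d_u / (3.8 A * N)), sorted by contact order."""
    contacts = np.asarray(contacts)
    order = np.abs(contacts[:, 1] - contacts[:, 0]) / n_residues
    extension = np.asarray(d_u, dtype=float) / (PEPTIDE_BOND * n_residues)
    idx = np.argsort(order, kind="stable")
    return order[idx], extension[idx]

def fraction_decreasing(y):
    """Fraction of consecutive steps that do not increase (1.0 = monotonic)."""
    steps = np.diff(y)
    return float(np.mean(steps <= 0)) if steps.size else 1.0

# Toy data for a 10-residue chain: longer-range contacts break earlier.
x, y = normalize_unfolding_curve([(0, 10), (2, 6), (1, 9)], [38.0, 380.0, 190.0], 10)
print(x, y, fraction_decreasing(y))
```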

It is not surprising that T_max is the temperature where unfolding
simplifies. It is where the entropy of the system is changing most
rapidly, and thus where binding energies are becoming less important.
Consistent with this, the thermal energy at T_max is always
comparable to the contact energy ε. Other characteristic temperatures
for folding are substantially lower, and do not correlate well with
the simplification of the unfolding sequence. For example, the
temperature where folding is fastest, T_min, is 0.35 for our model of
1021 and the temperature where the protein spends half of its time in
the unfolded state is 0.25. At both of these temperatures the contact
dependence of the rupture process is still structured and
non-monotonic, as illustrated in Fig. 3.
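
The temperature of the specific heat maximum can be located from equilibrium energy samples with the standard fluctuation formula C = (⟨E²⟩ - ⟨E⟩²)/(k_B T²). A minimal sketch in reduced units (k_B = 1, energies in units of ε), with hypothetical function names and toy data standing in for real simulation output:

```python
import numpy as np

def specific_heat(energies, T):
    """Specific heat from energy fluctuations at temperature T (k_B = 1)."""
    e = np.asarray(energies, dtype=float)
    return e.var() / T ** 2

def find_t_max(temperatures, energy_samples):
    """Return the sampled temperature with the largest specific heat, and all C values."""
    c = [specific_heat(e, T) for T, e in zip(temperatures, energy_samples)]
    return temperatures[int(np.argmax(c))], c

# Toy energy samples at three temperatures (not simulation data).
temps = [0.5, 1.0, 2.0]
samples = [[0.0, 1.0], [0.0, 4.0], [0.0, 4.0]]
print(find_t_max(temps, samples))
```

In practice the temperature grid would need to be fine enough near the peak, or the curve interpolated, to resolve T_max.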

In summary, thermal fluctuations affect the force-displacement curves
in a profound manner, reducing force peaks and the number of
metastable configurations. Yang et al. have observed the decrease in
unbinding force with increasing T by modifying the solvent to reduce
ε. However, as in our simulations for their tandem lysozyme system,
domains unfold serially and there is little structure except for the
initial force peak. Studies of systems that unfold serially would
exhibit richer changes with T and we predict this should be observed
for lysozymes joined at their ends.

Raising T produces a dramatic simplification of the unfolding
sequence that culminates in a monotonic drop of d_u with contact
order. In the entropic limit, T > T_max, the tension in the chain is
consistent with the WLC model [13]. However, this model is often used
to describe unfolding at temperatures where folding is rapid. In this
regime we find substantial deviations from the WLC model, which may
affect the interpretation of experimental data. It would be desirable
to repeat our studies with detailed atomistic potentials like those
used for titin, but such studies remain too computationally intensive
for a thorough scan of parameter space.